PRELIMINARY RESULTS FROM THE ANALYSIS OF
WIND COMPONENT ERROR
by 

P.  A.  Jacobs  and  D.  P.  Gaver 


0.  INTRODUCTION 


Numerical  meteorological  models  are  used  to  assist  in  the  prediction  of 
weather.  Each  run  of  a  numerical  model  produces  forecasts  of 
meteorological  variables  which  are  used  as  preliminary  predictions  of  the 
future  values  of  these  variables.  These  initial  predictions  are  referred  to  as 
first-guess  values.  In  this  paper  first-guess  values  will  refer  to  the  most 
recent  12  hour  forecasts. 

In certain areas of the world, observations of the forecasted variables
become available; in our case the observations become available 12
hours after the first-guess values are computed. Prior to the next run of the
numerical model, a multivariate optimal interpolation analysis updates the
first-guess value of a variable by adding to it a weighted observed value of
the variable, if one is available. The weight multiplying the observed value
depends on estimates of the squared error of the first-guess value and the
squared error of the observation; cf. Goerss et al. [1991, a, b]. Thus it is
important to predict such first-guess squared errors.

The general problem of modeling and predicting mean square errors is
important but not widely studied; see Efron (1986) and Jorgenson (1987). In
the next section statistical models for the error of the first-guess are
introduced. The models assume the error of the first-guess has mean 0 but has
a scale parameter that is log-linear in suitable covariates, i.e. explanatory or
regression variables.


Results  are  reported  concerning  the  estimation  of  model  parameters,  and 
model  cross-validation  and  predictive  ability  for  u,  v  wind  component  data 
from  the  months  of  February  and  April  1991.  The  data  consist  of 
measurements  and  12  hour  forecasts  (first-guess  values)  from  93  stations  in 
North  America,  25N-75N.  The  forecasts  are  produced  using  the  NOGAPS 
Spectral  Forecast  Model;  cf.  Hogan  et  al.  Each  station  has  measurement  and 
first-guess  values  for  every  12  hours;  there  are  some  missing  observations. 
The  first-guess  values  are  subtracted  from  measurement  values  (if  available) 
to  obtain  observations  of  the  error  of  the  first-guess.  The  results  appear  in 
Sections  3  and  4  and  in  Appendices  B,  C  and  D. 

The results indicate that estimates of the variance of the error of first-guess
wind components can be improved by using covariates which are functions
of the wind components. Covariates using observed values of the wind
components appear to have more predictive ability than those using
first-guess values. Further exploratory work is needed to determine the degree
to which these statistical results can be used to improve the forecasting
ability of the numerical model.

1.  THE  MODELS 
Let

$U_o(t)$ = observed u-wind component at time t
$U_f(t)$ = first-guess u-wind component at time t
$V_o(t)$ = observed v-wind component at time t
$V_f(t)$ = first-guess v-wind component at time t

$r(t) = [(U_o(t) - U_o(t-1))^2 + (V_o(t) - V_o(t-1))^2]^{1/2}$

$s(t) = [U_o(t)^2 + V_o(t)^2]^{1/2}$

$Y(t) = U_o(t) - U_f(t)$ or $Y(t) = V_o(t) - V_f(t)$
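
The definitions above can be illustrated with a short sketch. It assumes the observed and first-guess wind components for one station are stored as equally spaced (12-hourly) NumPy arrays with no missing values; the function name and argument names are hypothetical and are not part of the original analysis.

```python
import numpy as np

def covariates_and_errors(u_obs, v_obs, u_fg, v_fg):
    """Return r(t), s(t) and the first-guess errors for t = 1, ..., T-1.

    Assumes consecutive array entries are 12 hours apart. The first time
    point is dropped because r(t) needs the previous observation.
    """
    u_obs, v_obs, u_fg, v_fg = (
        np.asarray(a, dtype=float) for a in (u_obs, v_obs, u_fg, v_fg)
    )
    # r(t): magnitude of the change in the observed wind vector since t-1
    r = np.hypot(u_obs[1:] - u_obs[:-1], v_obs[1:] - v_obs[:-1])
    # s(t): observed wind speed at time t
    s = np.hypot(u_obs[1:], v_obs[1:])
    # Y(t): observed minus first-guess value, one series per wind component
    y_u = u_obs[1:] - u_fg[1:]
    y_v = v_obs[1:] - v_fg[1:]
    return r, s, y_u, y_v
```

In practice missing observations would have to be handled before differencing; that step is omitted here.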

The  models  considered  are  as  follows: 




NORMAL  MODELS: 


One  Variable  Models 

1. {Y(t)} are independent normally distributed random variables with
mean 0 and variance

$\sigma_1^2(1;t) = \exp\{\alpha_1(1) + \beta_1(1)\, r(t)\}$.   (1)

2. {Y(t)} are independent normally distributed random variables with
mean 0 and variance

$\sigma_1^2(2;t) = \exp\{\alpha_1(2) + \beta_1(2)\, s(t)\}$.   (2)

Two Variable Model

3. {Y(t)} are independent normally distributed random variables with
mean 0 and variance

$\sigma_2^2(t) = \exp\{\alpha + \beta_1 r(t) + \beta_2 s(t)\}$.   (3)

CAUCHY  MODELS: 

While  many  measurement  errors  of  physical  quantities  are 
approximately  normal,  especially  "in  the  middle"  of  their  distribution,  there 
can  well  be  thicker-than-normal  tails  and  occasional  extreme  outliers.  These 
attributes  can  have  seriously  degrading  effects  in  regression-like  problems;  cf. 
Mosteller  and  Tukey  (1977),  Huber  (1981)  and  Hampel  (1986).  The  Cauchy 
distribution  is  a  symmetric  distribution  with  thicker  tails  than  those  of  the 
normal  distribution.  Distributions  with  long  straggling  tails  have  the 
tendency  to  produce  outlying  values.  The  following  models  use  the  Cauchy 
distribution  to  represent  and  suitably  compensate  for  more-thick-tailed 
measurement  error  than  that  of  the  Normal  distribution. 

One Variable Models

4. {Y(t)} are independent Cauchy random variables with scale parameter

$\sigma_1(1;t) = \exp\{\alpha_1(1) + \beta_1(1)\, r(t)\}$.   (4)

5. {Y(t)} are independent Cauchy random variables with scale parameter

$\sigma_1(2;t) = \exp\{\alpha_1(2) + \beta_1(2)\, s(t)\}$.   (5)

Two Variable Model

6. {Y(t)} are independent Cauchy random variables with scale parameter

$\sigma_2(t) = \exp\{\alpha + \beta_1 r(t) + \beta_2 s(t)\}$.   (6)

The form of the Cauchy density function with scale parameter $\sigma$ that is
used is

$f(y) = \frac{1}{\pi\sigma}\left[1 + \left(\frac{y}{\sigma}\right)^2\right]^{-1}$ for $-\infty < y < \infty$.


2.  ESTIMATION  OF  PARAMETERS 

For both normal and Cauchy models, the model parameters are estimated
by maximum likelihood. A system of equations is obtained by setting the first
partial derivative of the ln-likelihood function with respect to each parameter
equal to zero. The system of equations is solved numerically using Newton's
method to obtain the maximum likelihood estimates. The procedure for the
normal models is given in Appendix A.
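
As an illustration of the estimation step, the following is a minimal sketch of Newton's method for a mean-zero normal model whose ln variance is linear in the covariates. It is not the procedure of Appendix A, which is not reproduced here; the function names, the starting value, and the step-halving safeguard are assumptions made for this example.

```python
import numpy as np

def ln_lik(theta, X, y):
    """Normal ln-likelihood up to additive and multiplicative constants."""
    eta = X @ theta                      # ln variance for each observation
    return -np.sum(eta) - np.sum(y**2 * np.exp(-eta))

def fit_loglinear_variance(x, y, tol=1e-10, max_iter=100):
    """Maximum likelihood fit of ln var(Y) = alpha + beta' x by Newton's method.

    x: covariate array of shape (n,) or (n, k); y: errors of shape (n,).
    Returns the array (alpha, beta_1, ..., beta_k).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    X = np.column_stack([np.ones(len(y)), x])
    theta = np.zeros(X.shape[1])
    theta[0] = np.log(np.mean(y**2))     # start from the constant-variance fit
    for _ in range(max_iter):
        w = y**2 * np.exp(-(X @ theta))
        grad = X.T @ (w - 1.0)           # first partial derivatives
        hess = -(X * w[:, None]).T @ X   # matrix of second partial derivatives
        step = np.linalg.solve(hess, grad)
        # Step halving (an added safeguard): never accept a worse ln-likelihood
        t = 1.0
        while ln_lik(theta - t * step, X, y) < ln_lik(theta, X, y) and t > 1e-8:
            t /= 2.0
        theta = theta - t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return theta
```

Passing a two-column covariate array (for example r(t) and s(t) side by side) gives the two-variate fit with the same code.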

3. THE DATA ANALYSIS - FEBRUARY DATA
3.1 Observed Wind Covariate Models

In this subsection we report an assessment of the goodness of fit and
cross-validation for the normal models (1)-(3) using observational wind
components as covariates. There are six analyses: one for the u-wind
component (respectively v-wind component) for each pressure level height.
Each analysis proceeds along the same lines. In what follows, by data we mean
triples {Y(t), r(t), s(t)}.

In each analysis the data are randomly divided into two sets called DA
and DB without regard to the values of the data.
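
A random division of this kind can be sketched as follows. The equal-sized split and the fixed seed are assumptions for the example; the text states only that the division is random and made without regard to the data values.

```python
import numpy as np

def random_split(n, seed=0):
    """Return index arrays for two random, disjoint subsets (DA, DB) of 0..n-1."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)    # ordering that ignores the data values
    half = n // 2
    return np.sort(perm[:half]), np.sort(perm[half:])
```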

The maximum likelihood parameter estimates for each model (1)-(3) are
obtained for each set DA and DB and for all the data. The estimated values
appear in Table 1. The estimated variances $\sigma_1^2(1;t)$, $\sigma_1^2(2;t)$
and $\sigma_2^2(t)$ are computed for the parameters estimated from DA and DB
using (1)-(3) for each data point in DA and DB.

The models are for the variances of the observations rather than the
observations themselves. One possible procedure to assess goodness-of-fit
and cross-validate the models is by binning the data. To assess models (1) and
(3) the data (Y(t), r(t), s(t)) are binned into 10 bins based on ordering the values
of r(t) from smallest to largest. The data in the first bin correspond to the
smallest values of r(t); the data in the 10th bin correspond to the largest
values of r(t). Each bin contains about one tenth of the data, with the 10th bin
containing a few more data. The averages of the estimated variances for
models (1) and (3) are computed for each bin. The average of $Y(t)^2$ is also
computed for each bin.

To assess models (2) and (3) the same procedure is used but the binning is
based on the values of s(t).
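
The binning procedure described above can be sketched as follows. The function name is hypothetical, and placing the leftover points in the last bin is an assumption consistent with the remark that the 10th bin contains a few more data.

```python
import numpy as np

def binned_ln_averages(y, covariate, var_hat, n_bins=10):
    """ln of bin averages of Y(t)^2 and of the estimated variances.

    Data are ordered by the covariate (r(t) or s(t)) and cut into n_bins
    bins of equal size; any leftover points go into the last bin.
    """
    y = np.asarray(y, dtype=float)
    var_hat = np.asarray(var_hat, dtype=float)
    order = np.argsort(np.asarray(covariate, dtype=float), kind="stable")
    n = len(order)
    size = n // n_bins
    ln_y2, ln_var = [], []
    for b in range(n_bins):
        lo = b * size
        hi = (b + 1) * size if b < n_bins - 1 else n
        idx = order[lo:hi]
        ln_y2.append(np.log(np.mean(y[idx] ** 2)))
        ln_var.append(np.log(np.mean(var_hat[idx])))
    return np.array(ln_y2), np.array(ln_var)
```

Plotting the first returned array against the second gives one point per bin; points near the 45° line indicate that the model variance tracks the observed squared error in that bin.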

Figures 1-24 present graphs of ln [average $Y(t)^2$] in each bin versus ln
[average estimated variance] in each bin for models (1) and (3) and models (2)
and (3). Figures 1, 5, 9, 13, 17, 21 (respectively 2, 6, 10, 14, 18, 22) show the
logarithm of the average of the $Y(t)^2$ values of DA (respectively DB) versus the
logarithm of the average of the estimated variances for each bin using the
estimated parameters from DA (respectively DB). If a model were perfect, the
points would be close to the 45° line shown.

Figures 3, 7, 11, 15, 19, 23 (respectively 4, 8, 12, 16, 20, 24) present graphs of
ln average $Y(t)^2$ of DA (respectively DB) versus ln average estimated variances
using parameters estimated using data DB (respectively DA). Once again, if
the model were perfect, the points would be close to the 45° line.

Since the two-variate model (3) is shown with both one-variate models, it
is possible to obtain some idea of the effect of the two different sets of bins on
the ln averages. In particular, the graphs corresponding to the 500 mb height
winds, Figures 9-16, show that the display of ln averages can be quite sensitive
to which variate is used to do the binning.

Keeping this binning sensitivity in mind, the figures suggest the
following concerning the models using observed winds as covariates. It
appears that of the two one-variate models, model (1), which uses r(t) as the
covariate, is the better. The two-variate model (3) appears not much better
than model (1). If wind speed is used as the single covariate, it appears to
overstate the variance; the addition of the second covariate r(t) in this case
tends to make the estimated variance smaller and to bring the ln
average predicted variance in a bin closer to the ln average $Y(t)^2$ in the bin.

Preliminary examination of ln average $Y(t)^2$ in bins and ln average model
variances in bins for the Cauchy models suggests that the Cauchy models
result in little or no improvement over the results of the normal model. The
results of the Cauchy models will not be reported here.

Another way to assess goodness of fit and to cross-validate is to evaluate
the ln-likelihood for the different models at the parameter estimates. Larger
values of the ln-likelihood suggest better model fit; cf. Cox and Hinkley [1974].

Table 2 presents the values of the ln-likelihood, up to addition and
multiplication of constants, for the parameter estimates of Table 1; for a
one-variate model the function being evaluated is

$\ell = -n\alpha - \beta \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i^2 \exp\{-(\alpha + \beta x_i)\}$   (7)

where $y_i$ is the i-th observed error, $x_i$ is the corresponding covariate
value and $n$ is the number of observations; for the two-variate model
$\beta x_i$ is replaced by $\beta_1 r_i + \beta_2 s_i$. The values of $\ell$ are
presented for data DA (respectively DB) using the parameters fit using DA
(respectively DB); these are values assessing goodness of fit. Since maximum
likelihood is the estimation procedure, the largest value of $\ell$ in each of
these two rows is the one corresponding to the two-variate model. Values of
$\ell$ are also presented for data DA (respectively DB) using the parameters fit
using DB (respectively DA); these are values assessing cross-validation. The
underlined value in each row is the maximum value in that row; the
corresponding model provides the best model fit. The bold italicized value in
each row is the maximum value for the two one-variate models; the
corresponding one-variate model provides the best model fit between the two
one-variate models.
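
The evaluation of the ln-likelihood at given parameter values, as used for both goodness of fit and cross-validation, can be sketched as below. The expression coded is the mean-zero normal ln-likelihood with ln variance linear in the covariates, with the factor 1/2 and the additive constant dropped; this scaling, the function name, and the argument layout are assumptions for the example.

```python
import numpy as np

def ln_likelihood(theta, y, covariates=None):
    """Evaluate -sum(ln var_i) - sum(y_i^2 / var_i) with ln var_i = alpha + beta' x_i.

    theta: (alpha, beta_1, ..., beta_k); covariates: None for the constant
    variance model, otherwise an array of shape (n,) or (n, k).
    """
    y = np.asarray(y, dtype=float)
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    ln_var = np.full(y.shape, theta[0])
    if covariates is not None:
        X = np.asarray(covariates, dtype=float).reshape(len(y), -1)
        ln_var = ln_var + X @ theta[1:]
    return float(-np.sum(ln_var) - np.sum(y**2 * np.exp(-ln_var)))
```

For cross-validation the parameters passed in are those fit on the other data set: for example, the estimates obtained from DB are used with the errors and covariates of DA.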




TABLE 1. NORMAL MODELS
PARAMETER ESTIMATES
OBSERVED WIND COVARIATES

                            One-Variate Models              Two-Variate Model
Pressure  Wind   Data       r(t)            s(t)       ln MSE = α + β1 r(t) + β2 s(t)
Height    Comp.  Set       α      β        α      β         α      β1      β2

850       u      A       2.02   0.054    1.94   0.050     1.70   0.040   0.040
                 B       2.09   0.050    1.76   0.066     1.63   0.027   0.058
                 ALL     2.06   0.052    1.85   0.058     1.66   0.034   0.049

850       v      A       2.19   0.040    1.59   0.080     1.51   0.015   0.076
                 B       2.05   0.051    1.68   0.071     1.56   0.028   0.062
                 ALL     2.12   0.045    1.64   0.076     1.53   0.022   0.069

500       u      A       2.29   0.045    2.45   0.018     2.11   0.040   0.011
                 B       2.18   0.054    2.19   0.029     1.84   0.046   [illegible]
                 ALL     2.23   0.050    2.32   0.024     1.97   0.043   0.015

500       v      A       2.31   0.039    2.27   0.023     1.99   0.033   0.018
                 B       2.24   0.042    2.14   0.028     1.89   0.034   0.021
                 ALL     2.28   0.041    2.21   0.025     1.94   0.034   0.019

250       u      A       3.12   0.034    2.48   0.034     2.22   0.023   0.030
                 B       2.95   0.039    2.45   0.032     2.17   0.026   0.027
                 ALL     3.04   0.036    2.46   0.033     2.20   0.024   0.028

250       v      A       3.01   0.031    2.36   0.033     2.13   0.021   0.029
                 B       2.97   0.032    2.28   0.035     2.12   0.021   0.028
                 ALL     2.98   0.031    2.31   0.034     2.12   0.021   0.029

$r(t) = [(u(t) - u(t-1))^2 + (v(t) - v(t-1))^2]^{1/2}$
$s(t) = [u(t)^2 + v(t)^2]^{1/2}$

NOTE: Data are divided into two sets randomly without regard to data
values. One set is called A; the other is called B.

TABLE 2. NORMAL MODELS
VALUES OF LN-LIKELIHOOD
OBSERVED WIND COVARIATES

                                              One-Variate Models     Two-Variate
Pressure  Wind   Data                                                  Model
Height    Comp.  Set    Model    Constant       r(t)        s(t)

850       u      A      A        -7695.9     -7596.5     -7591.6    [illegible]
                 B      B        -7746.9     -7661.8     -7560.5     -7540.3
                 B      A        -7747.5     -7663.9     -7571.1     -7553.2
                 A      B        -7696.4     -7598.5     -7601.9     -7551.7

850       v      A      A        -7759.1     -7703.4     -7505.5     -7498.5
                 B      B        -7707.6     -7614.0     -7512.8     -7489.4
                 B      A        -7708.2     -7620.6     -7515.4     -7497.5
                 A      B        -7759.7     -7710.3     -7508.2    [illegible]

500       u      A      A        -9454.3     -9314.6     -9405.3     -9299.2
                 B      B        -9518.7     -9296.8     -9376.7    [illegible]
                 B      A        -9519.5     -9303.2     -9397.9     -9257.4
                 A      B        -9455.2     -9320.4     -9425.2     -9315.8

500       v      A      A        -9317.7     -9317.1     -9243.0     -9174.9
                 B      B        -9258.3     -9140.7     -9161.1     -9091.7
                 B      A        -9259.1     -9142.8     -9165.1    [illegible]
                 A      B        -9318.4     -9219.2     -9247.5    [illegible]

250       u      A      A       -11265.4    -10657.4    -10431.5    [illegible]
                 B      B       -10782.7    -10389.8    -10249.0    -10149.2
                 B      A       -10829.6    -10403.9    -10261.7    -10162.1
                 A      B       -11319.3    -10673.4    -10445.1    [illegible]

250       v      A      A       -10417.8    -10259.4    -10094.9    -10032.0
                 B      B       -10783.1    -10181.5    -10050.1     -9960.1
                 B      A       -10814.9    -10182.5    -10051.8    [illegible]
                 A      B       -10446.4    -10260.3    -10096.3    -10033.2

The models considered are: {Y(t)} are independent normal with mean 0 and
variance

$\sigma^2(t) = \sigma^2$ (constant variance)   (8)

and models (1)-(3).

The two-variate model (3) maximizes the cross-validation value of $\ell$ for
data DA (respectively DB) with a model using parameters fit using DB
(respectively DA). This suggests that r(t) and s(t) together have
predictive ability.

For the one-variate models (1) and (2), the cross-validation values of $\ell$ for
DA (respectively DB) using the parameters fit using DB (respectively DA) are
equally divided as to whether r(t) by itself or s(t) by itself produces the higher
value of $\ell$. This suggests that neither variate by itself has obviously better
predictive value than the other. The goodness-of-fit values of $\ell$ for the
one-variate models using DA (respectively DB) have a higher value of $\ell$
associated with s(t) the majority of the time. This suggests that s(t) by itself
provides a better description of the data than r(t) by itself.

Comparing the value of $\ell$, denoted $\ell_c$, for DA (respectively DB) using the
constant variance model (8) fit using DA (respectively DB) with the corresponding
cross-validation value of $\ell$ for DA (respectively DB) using models (2), (3) fit
using DB (respectively DA) indicates the following. The values of $\ell$ for
models (2) and (3) fit with the other half of the data are larger than the
corresponding value $\ell_c$ for the constant variance model fit using the data to
be modeled. This indicates that both models (2) and (3) fit with the other half
of the data describe the data better than the best constant variance model (8) fit
with the same data it is used to summarize.
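
For reference, the constant variance benchmark has a closed form. The short derivation below is a sketch that assumes the ln-likelihood is scaled as minus the sum of the ln variances minus the sum of squared errors divided by their variances (constants dropped); the symbols $\hat{\alpha}_c$ and $y_i$ are notation introduced here for illustration.

```latex
% Constant variance model: \ln \sigma^2 = \alpha for every observation.
\ell(\alpha) = -n\alpha - e^{-\alpha} \sum_{i=1}^{n} y_i^2
% Setting the derivative to zero:
\frac{d\ell}{d\alpha} = -n + e^{-\alpha} \sum_{i=1}^{n} y_i^2 = 0
\quad\Longrightarrow\quad
\hat{\alpha}_c = \ln\!\Big( \frac{1}{n} \sum_{i=1}^{n} y_i^2 \Big)
% Value of the benchmark at its maximum:
\ell_c = \ell(\hat{\alpha}_c) = -n \ln\!\Big( \frac{1}{n} \sum_{i=1}^{n} y_i^2 \Big) - n
```

Under this scaling, $\ell_c$ depends on the data only through the mean squared error, so a covariate model fit on other data beats the benchmark only if it tracks how the squared error varies from point to point.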

3.2 First Guess Wind Covariate Models

In this section we report the results of using models (1)-(3) and (8) with
first guess winds as covariates; the two covariates considered are

$r_f(t) = [(U_f(t) - U_f(t-1))^2 + (V_f(t) - V_f(t-1))^2]^{1/2}$

and

$s_f(t) = [U_f(t)^2 + V_f(t)^2]^{1/2}$.

The analysis is the same as in the previous subsection. The data sets DA and
DB are the same as those in the previous subsection in each case.

The values of the parameter estimates appear in Table 3. The
corresponding values of $\ell$ appear in Table 4. Once again the underlined
value of $\ell$ is the largest value in each row; the bold italicized value is the
largest value between the two one-variate models.

In all but two cases the values of $\ell$ for the observed wind covariates are
larger than those for the first-guess wind covariates. This suggests that the
observed wind components have better predictive and descriptive value than
the first guess wind components.

Table 4 also indicates the following results concerning models using first
guess wind covariates. Between the two one-variate models (1) and (2), the
one-variate model using first guess wind speed always has the greater value of
$\ell$. This suggests that first guess wind speed alone has better predictive
and descriptive value than $r_f(t)$ alone. The cross-validation values of $\ell$ for
data DA (respectively DB) using parameters fit with DB (respectively DA) are
maximized about half the time using the one-variate model with $s_f(t)$. The
other times the maximal $\ell$ is associated with the two-variate model.

TABLE 3. NORMAL MODELS
PARAMETER ESTIMATES
FIRST GUESS WIND COVARIATES

                            One-Variate Models              Two-Variate Model
Pressure  Wind   Data      r_f(t)          s_f(t)
Height    Comp.  Set       α      β        α      β         α      β1           β2

850       u      A       2.52  -0.006    2.23   0.025     2.30  -0.023        0.031
                 B       2.48   0.004    2.21   0.029     2.26  -0.013        0.032
                 ALL     2.50  -0.0007   2.22   0.027     2.28  -0.017        0.032

850       v      A       2.54  -0.004    2.34   0.017     2.40  -0.016        0.021
                 B       2.41   0.013    2.27   0.021     2.28  -0.001        0.022
                 ALL     2.47   0.005    2.31   0.019     2.34  -0.008        0.021

500       u      A       2.61   0.023    2.35   0.023     2.25   0.015        0.021
                 B       2.68   0.019    2.48   0.017     2.39   0.014        0.016
                 ALL     2.65   0.021    2.42   0.020     2.31   0.014        0.018

500       v      A       2.71   0.006    2.40   0.018     2.39  [illegible]   0.017
                 B       2.76  -0.002    2.29   0.022     2.35  [illegible]   0.023
                 ALL     2.73   0.002    2.34   0.020     2.37  [illegible]   0.020

250       u      A       4.02  -0.005    2.66   0.037     2.67  -0.002        0.037
                 B       3.46   0.021    3.01   0.022     2.78   0.019        0.021
                 ALL     3.74   0.009    2.79   0.031     2.69   0.008        0.031

250       v      A       3.49   0.007    3.00   0.017     2.93   0.006        0.017
                 B       3.53   0.016    2.87   0.026     2.60   0.018        0.026
                 ALL     3.50   0.012    2.92   0.022     2.75   0.013        0.022

TABLE 4. NORMAL MODEL
FIRST GUESS WIND COVARIATES
VALUE OF LN-LIKELIHOOD

                                              One-Variate Models     Two-Variate
Pressure  Wind   Data                                                  Model
Height    Comp.  Set    Model    Constant      r_f(t)      s_f(t)

850       u      A      A        -7695.9     -7695.1     -7667.6     -7657.9
                 B      B        -7746.9     -7746.6     -7705.6    [illegible]
                 B      A        -7747.5     -7749.5     -7708.8    [illegible]
                 A      B        -7696.4     -7697.7     -7668.6     -7660.9

850       v      A      A        -7759.1     -7758.7     -7745.4     -7741.1
                 B      B        -7707.6     -7703.7     -7685.5     -7685.5
                 B      A        -7708.2     -7711.3     -768[?].3   -7691.8
                 A      B        -7759.7     -7765.3     -7747.3     -7746.7

500       u      A      A        -9454.3     -9433.7     -9391.1     -9383.0
                 B      B        -9518.7     -9505.7     -9481.9     -9475.2
                 B      A        -9519.5     -9507.5     -9486.2     -9479.1
                 A      B        -9455.2     -9435.5     -9395.4     -9387.0

500       v      A      A        -9317.7     -9316.[?]   -9281.2     -9281.2
                 B      B        -9258.3     -9258.2     -9202.6     -9199.5
                 B      A        -9259.1     -9261.5     -9206.2    [illegible]
                 A      B        -9318.4     -9319.7     -9284.3     -9288.4

250       u      A      A       -11265.4    -11263.9    -10907.3    -10907.0
                 B      B       -10782.7    -10745.7    -10684.5    -10653.5
                 B      A       -10829.6    -10846.5    -10745.7    [illegible]
                 A      B       -11319.3    -11371.6    -11035.7    [illegible]

250       v      A      A       -10417.8    -10414.2    -10349.3    -10346.7
                 B      B       -10783.1    -10758.9    -10622.4    -10587.8
                 B      A       -10814.9    -10796.9    -10658.7    -10640.3
                 A      B       -10446.4    -10446.4    [illegible] -10389.1

Comparing the value of $\ell$, denoted $\ell_c$, for DA (respectively DB) using the
constant variance model (8) fit using DA (respectively DB) with the cross-validation
value of $\ell$ for DA (respectively DB) using models (2), (3) fit using DB
(respectively DA) indicates the following. The values of $\ell$ for models (2) and
(3) fit with the other half of the data are always larger than the corresponding
value $\ell_c$ for the constant variance model fit using the data to be modeled.
This suggests that both models (2) and (3) fit with the other half of the data
describe the data somewhat better than the best constant variance model (8) fit
with the data to be described.

In summary, based on values of $\ell$, when first guess winds are used as
covariates it appears that the one-variate model using first guess wind speed
is an attractive choice for predictive purposes. When observational winds are
used as covariates, the two-variate model appears to have the best predictive
value.

Assessing goodness of fit and cross-validation using values of $\ell$ has the
advantage of not being sensitive to binning. However, $\ell$ may be sensitive to
the choice of data sets DA and DB. Further work needs to be done to develop
procedures to assess goodness of fit and for cross-validation. Procedures based on
bootstrapping or jackknifing hold some promise.

4. THE DATA ANALYSIS - APRIL AND FEBRUARY DATA

In this section we report results of an assessment of goodness of fit for the
normal models (1)-(3) for April data. We also report results concerning using
a model whose parameters are fit using February (respectively April) data
to model April (respectively February) data.

4.1 Observed Wind Covariate Models

In this subsection we report results for normal models (1)-(3) using
observed wind components as covariates. There are six analyses: one for the
u-wind component (respectively v-wind component) for each pressure
height.

Table 5 shows the values of the parameter estimates for both the February
data and April data. Table 6 shows the values of $\ell$ for February data
(respectively April data) using parameters fit using February data (respectively
April data). Values of $\ell$ are also presented for February data (respectively
April data) using parameters fit using April data (respectively February data).
Once again, larger values of $\ell$ indicate better model fit. The underlined value
in each row is the maximum value in that row. The bold italicized value in
each row is the maximum value of $\ell$ for the two one-variate models.

The values of $\ell$ for February data (respectively April data) using
parameters fit using April data (respectively February data) are maximized by
the two-variate model in all but one case; between the two one-variate
models, $\ell$ is maximized half the time by the model involving s(t).

Comparing the value of $\ell$, denoted $\ell_c$, for the model of constant variance
(8) for February (respectively April) data fit using February (respectively April)
data with the prediction value of $\ell$ for the models (2)-(3) for February
(respectively April) data fit using April (respectively February) data indicates
the following. The values of $\ell$ for models (2) and (3) fit with data from the
other month are always larger than the corresponding values of $\ell_c$ fit with
the data of the same month. This suggests that models (2) and (3) fit using
data from the other month have predictive value over a model of constant
variance fit using the data that is to be modeled.

TABLE 5. NORMAL MODELS
PARAMETER ESTIMATES
OBSERVED WIND COVARIATES

                            One-Variate Models              Two-Variate Model
Pressure  Wind   Data       r(t)            s(t)       ln MSE = α + β1 r(t) + β2 s(t)
Height    Comp.  Set       α      β        α      β         α      β1      β2

850       u      Feb.    2.06   0.052    1.85   0.058     1.66   0.034   0.049
                 Apr.    1.86   0.084    1.69   0.086     1.44   0.053   0.068

850       v      Feb.    2.12   0.045    1.64   0.076     1.53   0.022   0.069
                 Apr.    1.83   0.089    1.69   0.090     1.42   0.062   0.065

500       u      Feb.    2.23   0.050    2.32   0.024     1.97   0.043   0.015
                 Apr.    2.12   0.058    2.20   0.030     1.80   0.049   0.023

500       v      Feb.    2.28   0.041    2.21   0.024     1.94   0.034   0.019
                 Apr.    2.02   0.065    1.97   0.041     1.66   0.052   0.027

250       u      Feb.    3.04   0.036    2.46   0.033     2.20   0.024   0.028
                 Apr.    2.77   0.044    2.69   0.027     2.33   0.037   0.018

250       v      Feb.    2.98   0.031    2.31   0.034     2.12   0.021   0.029
                 Apr.    2.73   0.041    2.61   0.027     2.26   0.034   0.019

$r(t) = [(u(t) - u(t-1))^2 + (v(t) - v(t-1))^2]^{1/2}$
$s(t) = [u(t)^2 + v(t)^2]^{1/2}$

TABLE 6. NORMAL MODELS
VALUES OF LN-LIKELIHOOD
OBSERVED WIND COVARIATES

                                              One-Variate Models     Two-Variate
Pressure  Wind   Data                                                  Model
Height    Comp.  Set    Model    Constant       r(t)        s(t)

850       u      Feb.   Feb.    -15443.1    -15259.3    -15157.3    [illegible]
                 Apr.   Apr.    -15709.1    -15311.6    -15169.9    [illegible]
                 Apr.   Feb.    -15713.2    -15368.2    -15238.7    -15117.7
                 Feb.   Apr.    -15447.0    -15321.2    -15240.3    -15171.9

850       v      Feb.   Feb.    -15467.0    -15320.8    -15019.6    -14992.1
                 Apr.   Apr.    -15819.1    -15320.6    -15307.2    -15106.4
                 Apr.   Feb.    -15827.8    -15429.9    -15386.7    -15251.5
                 Feb.   Apr.    -15475.3    -15428.6    -15105.9    -15121.1

500       u      Feb.   Feb.    -18973.4    -18614.5    -18792.2    -18547.4
                 Apr.   Apr.    -18504.1    -18083.1    -18270.7    -17969.4
                 Apr.   Feb.    -18528.9    -18093.4    -18280.6    -17989.1
                 Feb.   Apr.    -18999.9    -18625.3    -18804.3    -18576.6

500       v      Feb.   Feb.    -18576.4    -18358.8    -18406.2    -18267.9
                 Apr.   Apr.    -18698.2    -17869.0    -18014.4    -17733.0
                 Apr.   Feb.    -18699.0    -17938.0    -18086.9    -17796.2
                 Feb.   Apr.    -18577.1    -18421.3    -18480.2    -18345.2

250       u      Feb.   Feb.    -22073.2    -21054.7    -20687.1    -20500.6
                 Apr.   Apr.    -22712.2    -20364.3    -20658.3    -20195.9
                 Apr.   Feb.    -22712.4    -20439.9    -20703.3    -20276.4
                 Feb.   Apr.    -22073.4    -21127.5    -20726.0    -20591.9

250       v      Feb.   Feb.    -21216.0    -20441.4    -20145.7    -19992.6
                 Apr.   Apr.    -21205.7    -20006.9    -20274.8    -19840.6
                 Apr.   Feb.    -21240.2    -20065.6    -20338.7    -19924.9
                 Feb.   Apr.    -21252.5    -20498.8    -20190.2    -20070.5

4.2 First Guess Wind Covariate Models

In this section we report results for normal models (1)-(3) using first
guess wind components as covariates.

Table 7 shows the values of the parameter estimates for both February
data and April data. Table 8 shows the values of $\ell$ for February data
(respectively April data) using parameters fit using February data (respectively
April data). Values of $\ell$ are also presented for February data (respectively
April data) using parameters fit using April data (respectively February data).
The underlined value in each row is the maximum value in that row. The
bold italicized value in each row is the maximum value of $\ell$ for the two
one-variate models.

The values of $\ell$ for the observed wind covariates are larger than those for
the first guess wind covariates except for two values associated with the
one-variate model using s(t) to model u-wind component error at the 250 mb
height for the model using parameters fit with the same data. This suggests
that the observed wind covariates provide a better model of the data both in
terms of goodness-of-fit and prediction.

The values of $\ell$ for February data (respectively April data) using
parameters fit using April data (respectively February data) are maximized
about half the time by the two-variate model and the other half the time by
the one-variate model using the first guess wind speed $s_f(t)$.

TABLE 7. NORMAL MODELS
PARAMETER ESTIMATES
FIRST GUESS WIND COVARIATES

                            One-Variate Models              Two-Variate Model
Pressure  Wind   Data      r_f(t)          s_f(t)
Height    Comp.  Set       α      β        α      β         α      β1      β2

850       u      Feb.    2.50  -0.0007   2.22   0.027     2.28  -0.017   0.032
                 Apr.    2.42   0.022    2.19   0.041     2.19  -0.002   0.041

850       v      Feb.    2.47   0.005    2.31   0.019     2.34  -0.008   0.021
                 Apr.    2.46   0.019    2.23   0.039     2.24  -0.004   0.040

500       u      Feb.    2.65   0.021    2.41   0.020     2.32   0.014   0.018
                 Apr.    2.50   0.031    2.23   0.030     2.11   0.019   0.028

500       v      Feb.    2.73   0.002    2.34   0.020     2.37  -0.004   0.020
                 Apr.    2.30   0.061    1.98   0.045     1.74   0.040   0.040

250       u      Feb.    3.74   0.009    2.79   0.031     2.69   0.008   0.031
                 Apr.    4.00  -0.010    3.48   0.014     3.63  -0.021   0.016

250       v      Feb.    3.50   0.012    2.92   0.022     2.75   0.013   0.022
                 Apr.    3.26   0.025    2.80   0.026     2.68   0.015   0.024

TABLE 8. NORMAL MODEL
VALUE OF LN-LIKELIHOOD
FIRST GUESS WIND COVARIATES

                                              One-Variate Models     Two-Variate
Pressure  Wind   Data                                                  Model
Height    Comp.  Set    Model    Constant      r_f(t)      s_f(t)

850       u      Feb.   Feb.    -15443.1    -15443.0    -15376.0    -15365.0
                 Apr.   Apr.    -15709.1    -15695.3    -15593.3    -15593.1
                 Apr.   Feb.    -15713.2    -15714.0    [illegible] -15624.2
                 Feb.   Apr.    -15447.0    -15471.3    -15413.2    -15409.9

850       v      Feb.   Feb.    -15467.0    -15466.0    -15431.8    -15429.5
                 Apr.   Apr.    -15819.1    -15808.8    -15716.0    -15715.7
                 Apr.   Feb.    -15827.8    -15824.2    -15759.4    -15757.7
                 Feb.   Apr.    -15475.3    -15846.6    -15493.2    -15490.5

500       u      Feb.   Feb.    -18973.4    -18940.3    -18875.1    -18860.2
                 Apr.   Apr.    -18504.1    -18455.5    -18331.3    -18312.9
                 Apr.   Feb.    -18528.9    -18473.2    -18351.3    -18332.6
                 Feb.   Apr.    -18999.9    -18956.6    -18897.5    -18885.5

500       v      Feb.   Feb.    -18576.4    -18575.9    -18485.3    -18484.1
                 Apr.   Apr.    -18698.2    -18509.1    -18289.2    -18208.2
                 Apr.   Feb.    -18699.0    -18683.4    -18441.4    [illegible]
                 Feb.   Apr.    -18577.1    -18805.8    -18656.5    -18778.6

250       u      Feb.   Feb.    -22073.2    -22061.3    -21624.3    -21613.7
                 Apr.   Apr.    -22712.2    -22699.3    -22641.2    -22609.5
                 Apr.   Feb.    -22712.4    -22739.2    -22906.5    -22987.4
                 Feb.   Apr.    -22073.4    -22139.0    -21792.9    -21863.5

250       v      Feb.   Feb.    -21216.0    -21190.5    -20988.0    -20957.8
                 Apr.   Apr.    -21205.7    -21142.9    -20919.7    -20899.6
                 Apr.           [remaining entries illegible]
A comparison of the value of $\ell$, $\ell_c$, for the constant variance model of
February (respectively April) data fit using the same month's February
(respectively April) data with the prediction values of $\ell$ for models
(1)-(3) of February (respectively April) data fit using the other month's April
(respectively February) data indicates the following. Slightly less than half
the time $\ell_c$ is smaller than the corresponding values of $\ell$ for models
(1)-(3) fit with the other month's data. This suggests that the first-guess wind
speed models fit using the other month's data may not describe the data as well
as a constant variance model fit using the data being modeled. This may be an
indication that models fit using first-guess February wind (respectively April
wind) data are not good predictors of April (respectively February) wind
component error.
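
The comparison described above can be sketched in a few lines, assuming that the value being compared is the normal ln-likelihood of Appendix A with variance exp{alpha + x beta}. The function names `normal_loglik` and `constant_loglik` are hypothetical and are not taken from the report; the sketch only shows how a model fit on one month could be scored on the other month and set against the constant variance model fit to the data being modeled.

```python
import numpy as np

def normal_loglik(y, X, alpha, beta):
    """Ln-likelihood of zero-mean normal data with variances exp(alpha + X beta).

    (alpha, beta) may have been estimated from a different data set,
    e.g. the other month; y and X are the data being predicted.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)      # n x p covariates
    eta = alpha + X @ np.asarray(beta, dtype=float).reshape(-1)
    return float(-0.5 * np.sum(eta)
                 - 0.5 * np.sum(y**2 * np.exp(-eta))
                 - 0.5 * len(y) * np.log(2.0 * np.pi))

def constant_loglik(y):
    """Maximized ln-likelihood of the constant variance model fit to y itself."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    s2 = np.mean(y**2)                  # MLE of the variance when the mean is 0
    return float(-0.5 * n * (np.log(s2) + 1.0 + np.log(2.0 * np.pi)))
```

Under these assumptions, a covariate model fit to the other month adds predictive value for a given data set only when `normal_loglik` evaluated at the other month's parameters exceeds `constant_loglik` for that data set.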

4.3  Conclusions 

Models (2) and (3) using observed wind components as covariates and fit
using February (respectively April) data appear to have predictive value for
April (respectively February) data. It is less clear whether models (1)-(3)
using first-guess wind components as covariates and fit using February
(respectively April) data have predictive value for April (respectively
February) wind component error data. It might be that models (1)-(3) fit with
first-guess data from other Aprils (respectively Februarys) are better
predictors of April (respectively February) wind component error.
Alternatively, if first-guess winds are to be used as predictors, it might be
worthwhile to develop a procedure to update the fitted model parameters using
new data as they come in.
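APPENDIX A


MAXIMUM LIKELIHOOD ESTIMATION FOR THE NORMAL MODEL


Let $Y_1, Y_2, \ldots, Y_n$ be independent normal random variables with mean 0
and variances

$$\sigma_i^2 = \exp\Big\{\alpha + \sum_{j=1}^{p} x_{ij}\beta_j\Big\} = \exp\{\alpha + x_i\beta\}, \qquad i = 1, \ldots, n, \tag{A.1}$$

where $(x_{i1}, \ldots, x_{ip})$ are fixed explanatory variables associated with $Y_i$.
The likelihood function for this model is

$$L(\alpha, \beta; y) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}} \exp\Big\{-\tfrac{1}{2}(\alpha + x_i\beta)\Big\} \exp\Big\{-\tfrac{1}{2}\, y_i^2 \exp\{-(\alpha + x_i\beta)\}\Big\}. \tag{A.2}$$

Hence, the ln-likelihood function is

$$\ell = -\frac{1}{2}\sum_{i=1}^{n}(\alpha + x_i\beta) - \frac{1}{2}\sum_{i=1}^{n} y_i^2 \exp\{-(\alpha + x_i\beta)\} - \frac{n}{2}\ln 2\pi. \tag{A.3}$$

Computing partial derivatives of $\ell$ with respect to $\alpha$ and $\beta_j$ results in

$$\frac{\partial \ell}{\partial \alpha} = \frac{1}{2}\Big[-n + \sum_{i=1}^{n} y_i^2 \exp\{-(\alpha + x_i\beta)\}\Big], \tag{A.4}$$

$$\frac{\partial \ell}{\partial \beta_j} = \frac{1}{2}\Big[-\sum_{i=1}^{n} x_{ij} + \sum_{i=1}^{n} y_i^2 \exp\{-(\alpha + x_i\beta)\}\, x_{ij}\Big]. \tag{A.5}$$

Setting $\partial \ell / \partial \alpha = 0$ results in the equation

$$e^{\alpha} = \frac{1}{n}\sum_{i=1}^{n} y_i^2 \exp\{-x_i\beta\}. \tag{A.6}$$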
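Setting $\partial \ell / \partial \beta_j = 0$ and replacing $e^{\alpha}$ by (A.6) yields the equation

$$0 = f_j(\beta) \equiv \sum_{i=1}^{n} y_i^2 e^{-x_i\beta} x_{ij} - \bar{x}_j \sum_{i=1}^{n} y_i^2 e^{-x_i\beta}, \tag{A.7}$$

where $\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}$. Further,

$$\frac{\partial f_j}{\partial \beta_k} = \bar{x}_j \sum_{i=1}^{n} y_i^2 e^{-x_i\beta} x_{ik} - \sum_{i=1}^{n} y_i^2 e^{-x_i\beta} x_{ij} x_{ik}. \tag{A.8}$$

If $f_k(\beta) = 0$, then

$$\sum_{i=1}^{n} y_i^2 e^{-x_i\beta} x_{ik} = \bar{x}_k \sum_{i=1}^{n} y_i^2 e^{-x_i\beta}. \tag{A.9}$$

Substituting (A.9) into (A.8) yields

$$\frac{\partial f_j}{\partial \beta_k} = -\sum_{i=1}^{n} y_i^2 e^{-x_i\beta}\,(x_{ij} x_{ik} - \bar{x}_j \bar{x}_k). \tag{A.10}$$

An iteration of a Newton procedure to solve the system of equations $0 = f_j(\beta)$,
$j = 1, \ldots, p$, yields the system of linear equations

$$0 = f_j(\beta^0) + \sum_{k=1}^{p} \frac{\partial f_j}{\partial \beta_k}(\beta^0)\,(\beta_k - \beta_k^0) \tag{A.11}$$

$$\phantom{0} = \sum_{i=1}^{n} y_i^2 e^{-x_i\beta^0}(x_{ij} - \bar{x}_j) - \sum_{k=1}^{p}\sum_{i=1}^{n} y_i^2 e^{-x_i\beta^0}(x_{ij} x_{ik} - \bar{x}_j \bar{x}_k)(\beta_k - \beta_k^0), \tag{A.12}$$

where $\beta^0$ is the current value for $\beta$. This system of linear equations is solved
for $\{\beta_k\}$. The Newton procedure is iterated until it converges. The resulting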
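
The iteration (A.6)-(A.12) can be illustrated with a minimal sketch, assuming NumPy and a starting value of beta = 0. The function name `fit_loglinear_variance`, the stopping rule and the starting value are assumptions made for illustration; the report does not state how the authors started or stopped the iteration, and this sketch includes no safeguard against divergence.

```python
import numpy as np

def fit_loglinear_variance(y, X, tol=1e-10, max_iter=100):
    """Estimate (alpha, beta) for zero-mean normal data with
    variances exp(alpha + X beta), following (A.6)-(A.12)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)   # n x p covariates
    xbar = X.mean(axis=0)                                # column means x-bar_j
    beta = np.zeros(X.shape[1])                          # assumed starting value
    for _ in range(max_iter):
        w = y**2 * np.exp(-X @ beta)                     # y_i^2 exp(-x_i beta)
        f = (X - xbar).T @ w                             # f_j(beta), as in (A.7)
        # Approximate Jacobian (A.10):
        # -sum_i w_i (x_ij x_ik - xbar_j xbar_k)
        J = -((X * w[:, None]).T @ X - w.sum() * np.outer(xbar, xbar))
        step = np.linalg.solve(J, -f)                    # linear system (A.12)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    alpha = np.log(np.mean(y**2 * np.exp(-X @ beta)))    # (A.6)
    return alpha, beta
```

For example, `alpha, beta = fit_loglinear_variance(y, s)` would fit a one-variate model with a single covariate array `s`, and stacking two covariates column-wise in `X` would fit a two-variate model.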
Figure 14
Figure 17
Figure 24


APPENDIX  B.  A  GRAPHICAL  ASSESSMENT  OF  GOODNESS  OF  FIT  AND 
CROSS-VALIDATION  OF  MODELS  OF  FEBRUARY  WIND  COMPONENT 
MEAN  SQUARE  ERROR  USING  FIRST-GUESS  WIND  COVARIATES 

In this appendix we present figures assessing goodness of fit and
cross-validation of the normal models (1)-(3) with first-guess wind covariates
fit to February data. As in subsection (3.2) the data are randomly divided into
two sets called DA and DB without regard to the values of the data; these sets
are the same as those in that section.

The maximum likelihood parameter estimates for each model (1)-(3) are
obtained for each set DA and DB and appear in Table 3. The estimated
variances $\sigma_1^2(t)$, $\sigma_2^2(t)$, $\sigma_3^2(t)$ are computed for the
parameters estimated from DA and DB using (1)-(3) for each data point in DA
and DB.

To assess models (1) and (3) the data $(y(t), r(t), s(t))$ are binned into 10 bins
based on ordering the values of $r(t)$ from smallest to largest. The data in the
first bin correspond to the smaller values of $r(t)$; the data in the 10th bin
correspond to the larger values of $r(t)$. Each bin contains about 1/10 of the
data, with the 10th bin containing a few more data. The averages of the
estimated variances for models (1) and (3) are computed for each bin. The
average $y(t)^2$ is also computed for each bin.

To assess models (2) and (3) the same procedure is used but the binning is
based on values of $s(t)$.
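
The binning procedure described above can be sketched as follows, assuming NumPy. The function name `binned_log_averages` and its arguments are hypothetical; `key` stands for whichever covariate, r(t) or s(t), is used to order the data, and `est_var` for the estimated variances of one model at each data point.

```python
import numpy as np

def binned_log_averages(y, est_var, key, n_bins=10):
    """Return (ln average y^2, ln average estimated variance) per bin.

    The data are ordered by `key` from smallest to largest; each bin
    holds about 1/n_bins of the data and the last bin takes the remainder.
    """
    y = np.asarray(y, dtype=float)
    est_var = np.asarray(est_var, dtype=float)
    order = np.argsort(key, kind="stable")       # indices sorted by covariate
    size = len(y) // n_bins
    edges = [i * size for i in range(n_bins)] + [len(y)]
    ln_obs, ln_pred = [], []
    for start, stop in zip(edges[:-1], edges[1:]):
        idx = order[start:stop]
        ln_obs.append(np.log(np.mean(y[idx] ** 2)))     # ln[average y(t)^2]
        ln_pred.append(np.log(np.mean(est_var[idx])))   # ln[average est. variance]
    return np.array(ln_obs), np.array(ln_pred)
```

Plotting the first returned array against the second, bin by bin, gives points that would lie near the 45° line if the model described the data well.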

Figures 1B-24B present graphs of ln[average $y(t)^2$] in each bin versus
ln[average estimated variance] in each bin for models (1) and (3) and models
(2) and (3). Figures 1B, 5B, 9B, 13B, 17B, 21B (respectively 2B, 6B, 10B, 14B,
18B, 22B) show the logarithm of the average of the $y(t)^2$ values of DA
(respectively DB) versus the logarithm of the average of the estimated
variances for each bin using the estimated parameters from DA (respectively
DB). If a model were perfect, a point should be close to the 45° line shown.
These figures assess goodness of fit.

Figures 3B, 7B, 11B, 15B, 19B, 23B (respectively 4B, 8B, 12B, 16B, 20B, 24B)
present graphs of ln average $y(t)^2$ of DA (respectively DB) versus ln average
estimated variances using parameters estimated using data DB (respectively
DA). Once again if the model were perfect, the points would be close to the
45° line.

As suggested by the values of the ln-likelihood in Tables 2 and 4, the
figures for models using first-guess covariates indicate weaker goodness of fit
and weaker cross-validation than Figures 1-24 for models with observed wind
speed covariates. Both goodness of fit and cross-validation appear to
improve somewhat for higher pressure height levels (Figures 17B-24B). This
suggests that models using first-guess covariates have greater predictive and
descriptive value at 250 mb height levels. However, they appear to be not as
good as models using observed wind speed as covariates.
Figure 1B
Figure 2B
Figure 3B
Figure 4B
Figure 6B
Figure 10B
Figure 17B


APPENDIX C. GRAPHICAL ASSESSMENT OF GOODNESS OF FIT AND
CROSS-VALIDATION OF MODELS FOR FEBRUARY AND APRIL WIND
COMPONENT MEAN SQUARE ERROR USING OBSERVED WIND COVARIATES

In this appendix we present graphs assessing goodness of fit and
predictive ability of the normal models (1)-(3) with observed wind covariates
fit to April and February data.

The maximum likelihood parameter estimates for each model (1)-(3) are
obtained for both February and April data and are displayed in Table 5. The
estimated variances $\sigma_1^2(t)$, $\sigma_2^2(t)$, $\sigma_3^2(t)$ are computed
for the parameters estimated from February and April data using (1)-(3) for
each data point in February and April.

To assess models (1) and (3) the data $(y(t), r(t), s(t))$ for each data set are
binned into 10 bins based on ordering the values of $r(t)$ from smallest to
largest. The data in the first bin correspond to the smaller values of $r(t)$; the
data in the 10th bin correspond to the larger values of $r(t)$. Each bin contains
about 1/10 of the data, with the 10th bin containing a few more. The averages
of the estimated variances for models (1) and (3) are computed for each bin.
The average $y(t)^2$ is also computed for each bin.

To assess models (2) and (3) the same procedure is used but the binning is
done using $s(t)$.

Figures 1C-24C present graphs of ln[average $y(t)^2$] in each bin versus
ln[average estimated variance] in each bin for models (1) and (3) and models
(2) and (3). Figures 1C, 5C, 9C, 13C, 17C, 21C (respectively 2C, 6C, 10C, 14C,
18C, 22C) show the logarithm of the average of the $y(t)^2$ values for February
(respectively April) versus the logarithm of the average of the estimated
variances for each bin using the estimated parameters from February
(respectively April). If a model were perfect, a point should be close to the 45°
line shown. These figures assess goodness of fit.

Figures 3C, 7C, 11C, 15C, 19C, 23C (respectively 4C, 8C, 12C, 16C, 20C, 24C)
present graphs of ln average $y(t)^2$ of February (respectively April) versus ln
average estimated variances using parameters estimated using April
(respectively February) data. Once again if the model were perfect, the points
would be close to the 45° line. These figures assess the ability of models fit
using February (respectively April) observed data to predict April
(respectively February) wind component mean square error.

The figures indicate once again that the display of ln averages can be quite
sensitive to which variate is used to do the binning.

Keeping this binning sensitivity in mind, the figures suggest the
following. The two-variate model (3) appears to best describe and predict the
mean square component wind error. Of the two one-variate models, model
(1), which uses $r(t)$ as the covariate, appears to be better. The one-variate
model using $s(t)$ appears to overstate the predicted mean square error.
The addition of the second covariate $r(t)$ to the one-variate model using $s(t)$
appears to decrease the predicted mean square error.
Figure 1C

850 MB; APR MODEL ON FEB DATA; OBS WIND
(1VAR = R[T] = o; 2VAR = +; BIN ON R[T])
Figure 3C

Figure 4C

850 MB V WIND; FEB MODEL ON FEB DATA; OBS
850 MB V WIND; APR MODEL ON APR DATA; OBS WIND
Figure 6C

850 MB V WIND; APR MODEL ON FEB DATA; OBS WIND
850 MB V WIND; FEB MODEL ON APR DATA; OBS WIND
500 MB U WIND; FEB MODEL ON FEB DATA; OBS WIND
Figure 12C

500 MB V WIND; FEB MODEL ON FEB DATA; OBS WIND
500 MB V WIND; APR MODEL ON FEB DATA; OBS WIND
500 MB V WIND; FEB MODEL ON APR DATA; OBS WIND
Figure 17C

250 MB U WIND; APR MODEL ON FEB DATA; OBS WIND
Figure 19C

Figure 21C

250 MB V WIND; APR MODEL ON APR DATA; OBS WIND
250 MB V WIND; APR MODEL ON FEB DATA; OBS WIND


APPENDIX D. GRAPHICAL ASSESSMENT OF GOODNESS OF FIT AND
CROSS-VALIDATION OF MODELS FOR FEBRUARY AND APRIL WIND
COMPONENT MEAN SQUARE ERROR USING FIRST-GUESS WIND COVARIATES

In this appendix we present graphs assessing goodness of fit and
predictive ability of the normal models (1)-(3) with first-guess wind
covariates fit to April and February data.

The maximum likelihood parameter estimates for each model (1)-(3) are
obtained for both February and April data and are displayed in Table 7. The
estimated variances $\sigma_1^2(t)$, $\sigma_2^2(t)$, $\sigma_3^2(t)$ are computed
for the parameters estimated from February and April data using (1)-(3) for
each data point in February and April.

To assess models (1) and (3) the data $(y(t), r(t), s(t))$ for each data set are
binned into 10 bins based on ordering the values of $r(t)$ from smallest to
largest. The data in the first bin correspond to the smaller values of $r(t)$; the
data in the 10th bin correspond to the larger values of $r(t)$. Each bin contains
about 1/10 of the data, with the 10th bin containing a few more. The averages
of the estimated variances for models (1) and (3) are computed for each bin.
The average $y(t)^2$ is also computed for each bin.

To assess models (2) and (3) the same procedure is used but the binning is
done using $s(t)$.

Figures 1D-24D present graphs of ln[average $y(t)^2$] in each bin versus
ln[average estimated variance] in each bin for models (1) and (3) and models
(2) and (3). Figures 1D, 5D, 9D, 13D, 17D, 21D (respectively 2D, 6D, 10D, 14D,
18D, 22D) show the logarithm of the average of the $y(t)^2$ values for February
(respectively April) versus the logarithm of the average of the estimated
variance for each bin using the estimated parameters from February
(respectively April). If a model were perfect, a point should be close to the 45°
line shown. These figures are an indication of goodness of fit.

Figures 3D, 7D, 11D, 15D, 19D, 23D (respectively 4D, 8D, 12D, 16D, 20D,
24D) present graphs of ln average $y(t)^2$ of February (respectively April) versus
ln average estimated variances using parameters estimated using April
(respectively February) data. Once again if the model were perfect, the points
would be close to the 45° line. These figures assess the ability of models fit
using February (respectively April) first-guess data to predict April
(respectively February) wind component mean square error.

The figures indicate once again that the display of ln averages can be quite
sensitive to which variate is used to do the binning.

The figures indicate the following. As suggested by comparison of the
ln-likelihood values of Tables 6 and 8 for models with observed wind
covariates and first-guess wind covariates, the figures suggest that models
using first-guess wind covariates do not describe or predict mean square error
for wind components as well as models using observed wind components.
The two-variate model appears to produce smaller mean square errors
than the one-variate models; this tendency is most striking in the figure with
first-guess wind speed being used as the single covariate.

The models fit using April first-guess data appear to be better
descriptive and predictive models than those fit using February first-guess
data.

The figures indicating predictive ability (3D, 4D, 7D, 8D, 11D, 15D, 19D,
20D, 23D and 24D) correspond fairly well to the differences between the
minimizing value of $-\ell$ for the models with covariates and the value of $-\ell$ for
the constant model (no covariates) in the corresponding rows of Table 8. If
the value of $\ell$ for the constant model is larger than any other values in the
row, the corresponding figure for that row shows no association.
850 MB U WIND; APR MODEL ON FEB DATA; 1ST GUESS
(1VAR = R[T] = o; 2VAR = +; BIN ON R[T])
Figure 3D

850 MB U WIND; FEB MODEL ON APR DATA; 1ST GUESS WIND
Figure 4D

850 MB V WIND; FEB MODEL ON FEB DATA; 1ST GUESS WIND
Figure 5D

850 MB V WIND; APR MODEL ON APR DATA; 1ST GUESS WIND
Figure 6D

850 MB V WIND; APR MODEL ON FEB DATA; 1ST GUESS WIND
Figure 7D

850 MB V WIND; FEB MODEL ON APR DATA; 1ST GUESS WIND
Figure 8D

500 MB U WIND; FEB MODEL ON FEB DATA; 1ST GUESS WIND
Figure 9D

500 MB U WIND; APR MODEL ON APR DATA; 1ST GUESS WIND

500 MB U WIND; FEB MODEL ON APR DATA; 1ST GUESS WIND
Figure 12D

500 MB V WIND; FEB MODEL ON FEB DATA; 1ST GUESS WIND
(axis: LN AV PRED MSE PER BIN; 1VAR = WS[T] = o; 2VAR = +; BIN ON WS[T])

500 MB V WIND; APR MODEL ON FEB DATA; 1ST GUESS WIND
Figure 15D

500 MB V WIND; FEB MODEL ON APR DATA; 1ST GUESS WIND

250 MB U WIND; FEB MODEL ON FEB DATA; 1ST GUESS WIND
Figure 17D

250 MB U WIND; APR MODEL ON FEB DATA; 1ST GUESS WIND

250 MB U WIND; FEB MODEL ON APR DATA; 1ST GUESS WIND
Figure 20D

250 MB V WIND; FEB MODEL ON FEB DATA; 1ST GUESS WIND
Figure 21D

250 MB V WIND; APR MODEL ON APR DATA; 1ST GUESS WIND
Figure 22D

250 MB V WIND; FEB MODEL ON APR DATA; 1ST GUESS WIND